Generating synthetic observations for latent and observed variables for PBMC data with single-cell variational inference (scVI)

Here we use scVI to generate synthetic observations of the latent and observed variables for the PBMC (peripheral blood mononuclear cell) data. We use both the original approach, built on an ordinary variational autoencoder (VAE), and the variant with a linear decoder network (LDVAE).

We load the expression data and labels to inspect the structure of the data.

We then load the data for use in scVI.

Fit VAE

We fit a VAE to the data with a fixed set of hyperparameters.

We then extract latent space information to inspect what the model has learned.

We then draw samples of the latent and observed variables. Here we draw from the posterior distribution, that is, the distribution of the latent variables conditional on the observations. We could also draw from the prior distribution, which ignores the observations.
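To make the difference between posterior and prior draws concrete, here is a minimal pure-Python sketch. It is not scVI code: the encoder outputs (`q_mean`, `q_var`), the toy decoder weights, and the dispersion value are hypothetical, and we assume a plain negative binomial likelihood (sampled as a gamma-Poisson mixture) without the zero inflation that scVI can additionally model.

```python
import math
import random


def sample_posterior_z(q_mean, q_var, rng):
    """Draw z ~ N(q_mean, diag(q_var)), the approximate posterior for one cell."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(q_mean, q_var)]


def sample_prior_z(n_latent, rng):
    """Draw z ~ N(0, I), which does not depend on any observation."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_latent)]


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_counts(z, weights, dispersion, rng):
    """Toy decoder: per-gene mean exp(w . z), then a negative binomial draw."""
    counts = []
    for gene_weights in weights:
        mean = math.exp(sum(w * zk for w, zk in zip(gene_weights, z)))
        # Gamma-Poisson mixture: the gamma rate has expectation `mean`.
        rate = rng.gammavariate(dispersion, mean / dispersion)
        counts.append(sample_poisson(rate, rng))
    return counts


rng = random.Random(0)

# Hypothetical encoder output for one cell with two latent dimensions.
q_mean, q_var = [0.5, -0.2], [0.05, 0.10]
# Hypothetical decoder weights: 3 genes x 2 latent dimensions.
weights = [[0.8, 0.1], [-0.3, 1.2], [0.4, -0.6]]

z_post = sample_posterior_z(q_mean, q_var, rng)   # conditional on the cell
z_prior = sample_prior_z(len(q_mean), rng)        # unconditional
x_post = sample_counts(z_post, weights, dispersion=2.0, rng=rng)
x_prior = sample_counts(z_prior, weights, dispersion=2.0, rng=rng)
```

Posterior draws stay close to the cell that produced `q_mean` and `q_var`, whereas prior draws produce synthetic cells that are not tied to any particular observation.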

We save the drawn samples of the latent and observed variables for further investigation with log-linear models.
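One simple way to store such samples is one CSV file per variable type, with one row per cell. The sketch below uses only the standard library. The helper `save_samples`, the file name, the column names, and the sample values are all hypothetical placeholders, and the temporary directory is used only to keep the example self-contained.

```python
import csv
import os
import tempfile


def save_samples(path, rows, column_names):
    """Write one row per cell, preceded by a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(column_names)
        writer.writerows(rows)


# Placeholder values standing in for drawn latent samples (2 cells, 2 dimensions).
latent_samples = [[0.12, -1.30], [0.40, 0.90]]

out_dir = tempfile.mkdtemp()
latent_path = os.path.join(out_dir, "latent_samples.csv")
save_samples(latent_path, latent_samples, ["z1", "z2"])
```

The same helper can be reused for the observed variables by passing the gene names as `column_names`.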

Fit LDVAE

Here we repeat the steps shown above for the VAE with the LDVAE. Its decoder network has no hidden layers, which makes the latent variables easier to interpret.
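The interpretability gain can be illustrated with a toy example: without hidden layers, the decoder reduces to a single weight matrix, so each latent dimension has one weight (loading) per gene that can be ranked directly. This is a sketch of the idea, not the LDVAE implementation. The loadings, gene names, and the helpers `linear_decode` and `top_genes` are hypothetical.

```python
def linear_decode(z, loadings):
    """Linear predictor per gene: one dot product between z and that gene's loadings."""
    return [sum(w * zk for w, zk in zip(gene_loadings, z)) for gene_loadings in loadings]


def top_genes(loadings, gene_names, latent_dim, k=2):
    """Rank genes by the absolute loading on one latent dimension."""
    ranked = sorted(
        zip(gene_names, loadings),
        key=lambda pair: abs(pair[1][latent_dim]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical loadings: 4 genes x 2 latent dimensions.
gene_names = ["gene_a", "gene_b", "gene_c", "gene_d"]
loadings = [
    [2.0, 0.1],
    [-0.5, 1.5],
    [0.3, -2.5],
    [1.0, 0.0],
]

decoded = linear_decode([1.0, 2.0], loadings)
drivers_dim0 = top_genes(loadings, gene_names, latent_dim=0)
drivers_dim1 = top_genes(loadings, gene_names, latent_dim=1)
```

With hidden layers in the decoder, no such single matrix exists, so the effect of a latent dimension on a gene depends on the values of the other dimensions.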